Abstract

Background: The efficacy of the CTL component of a future HIV-1 vaccine will depend on the induction of 
responses with the most potent antiviral activity and broad HLA class I restriction. However, current HIV vaccine 
designs are largely based on viral sequence alignments only, not incorporating experimental data on T cell 
function and specificity. 

Methods: Here, 950 untreated HIV-1 clade B or -C infected individuals were tested for responses to sets of 410 
overlapping peptides (OLP) spanning the entire HIV-1 proteome. For each OLP, a "protective ratio" (PR) was 
calculated as the ratio of median viral loads (VL) between OLP non-responders and responders. 

Results: For both clades, there was a negative relationship between the PR and the entropy of the OLP sequence. 
There was also a significant additive effect of multiple responses to beneficial OLP. Responses to beneficial OLP 
were of significantly higher functional avidity than responses to non-beneficial OLP. They also had superior in-vitro 
antiviral activities and, importantly, were at least as predictive of individuals' viral loads as their HLA class I 
genotypes. 

Conclusions: The data thus identify immunogen sequence candidates for HIV and provide an approach for T cell 
immunogen design applicable to other viral infections. 

Keywords: HIV specific CTL, clade B, clade C, HLA, vaccine immunogen design, functional avidity, epitope, entropy, 
immune correlate 



Background 

HIV-1 infection induces strong and broadly directed 
HLA class I restricted T cell responses for which speci- 
fic epitopes and restricting HLA class I alleles have been 
associated with relative in vivo viral control [1]. The 
bulk of the anti-viral CTL response appears to be dis- 
proportionately HLA-B restricted, but the relative con- 
tribution of targeted viral regions and restricting HLA 
molecules on the effectiveness of these responses 
remains unclear [2-5]. In addition, the impact of HIV-1 
sequence diversity on the effectiveness of virus-specific 
T cell immunity in vivo is unclear, as functional con- 
straints of escape variants, codon-usage at individual 
protein positions, T cell receptor (TCR) plasticity and 
functional avidity and cross-reactivity potential may all 
contribute to the overall antiviral activity of a specific T 
cell response [6-13]. Of note, T cell responses to Gag 
have most consistently been associated with reduced 
viral loads in both clade B and clade C infected cohorts 
[14-16]; however, the specific regions in Gag responsible 
for this effective control remain poorly defined. In
addition, it is unclear whether the relative benefit of Gag
is due to any other specific characteristic of this protein,
such as rapid antigen-representation upon infection,
protein expression levels, amino acid composition and/or
inherently greater processability and immunogenicity,
particularly in the context of selected HLA class I alleles 
[17,18]. Thus, concerns remain that a purely Gag-based 
vaccine might mainly benefit those people with a parti- 
cular HLA genotype and will not take advantage of 
potentially beneficial targets outside of Gag [4,16,17,19]. 
In addition, CTL escape and viral fitness studies have 
focused largely on Gag-derived epitopes presented in 
the context of protective HLA class I alleles such as 
HLA-B27 and -B57 [7,20,21], yielding results that may 
not be generalizable to the genetically diverse majority 
of the human population. Furthermore, many studies 
have focused on immunodominant targets only, despite 
some studies in HIV-1 and SIV infection demonstrating 
a crucial contribution of sub-dominant responses to tar- 
gets outside of Gag to the effective in-vivo viral control 
[4,22]. Thus, the current view on what may constitute a 
protective cellular immune response to HIV-1 is likely 
biased towards immunodominant responses and those 
restricted by frequent HLA class I alleles and HLA 
alleles associated with superior disease outcome. 

To overcome these potential limitations, the design of 
an effective and broadly applicable HIV-1 vaccine 
should be based on information gained through
comprehensive analyses that extend across large portions of
the population's HLA class I heterogeneity. Here we 
focus on three cohorts totaling more than 950 
untreated, chronically HIV-1 infected individuals with 
clade B and C infections, from which responses to cer- 
tain regions of the viral genome and specific T cell 
response patterns emerge as correlates of viral control. 
Importantly, the analyses identify functional properties 
unique to these responses and control for the impact of 
HLA class I alleles known to be associated with superior 
control of HIV-1 infection, thus providing vaccine 
immunogen sequence candidates with potential useful- 
ness in a broadly applicable HIV-1 vaccine. 

Methods 

Cohorts 

An HIV clade B infected cohort of 223 chronically 
infected, treatment naive individuals was recruited and 
tested at IMPACTA in Lima, Peru. The majority (78%) 
of enrollees were male and all recruited individuals
considered themselves to be of mixed Amerindian ethnicity
[14]. The cohort had a median viral load of 37,237
copies/ml (range < 50 to > 750,000) and a median CD4
count of 385 cells/µl (range 170-1151). A second clade B
infected cohort was established at the HIV-1 outpatient
clinic "Lluita contra la SIDA" at Hospital Germans Trias
i Pujol in Badalona (Barcelona, Spain), consisting of 48
treatment-naive subjects with viral loads below 10,000
copies/ml and CD4 cell counts > 350 cells/mm^3
("controllers", n = 24) or viral loads above 50,000
copies/ml and CD4 cell counts < 350 cells/mm^3
("non-controllers", n = 24). The HIV-1 clade C infected
cohort has been described in the past and consisted of 631
treatment-naive South Africans with a median viral load of
37,900 copies/ml (range < 50 to > 750,000) and a median
CD4 count of 393 cells/µl (range 1-1378) [16]. An
additional 78 individuals from a recently published cohort
in Boston were included in the analyses of functional
avidities [23-29]. HLA typing was performed as previously
described using SSP-PCR [30]. For Hepitope and PASS
analyses, 4-digit typing was used for the Lima cohort and
2-digit typing for the Durban cohort. Protocols were
approved in Lima by the IMPACTA Human Research
Committee, in Durban by the Ethical Committee of the
Nelson R. Mandela School of Medicine at the University of
KwaZulu-Natal and in Barcelona by the Human Research
Committee at Hospital Germans Trias i Pujol. All subjects
provided written informed consent.

Peptide test set and ELISpot assay

Previously described peptide sets matching HIV clade B and
C consensus sequences were used in all experiments. The
OLP-specific entropies for these sets have been calculated
in the past, based on available sequence datasets [31-33]
and http://www.hiv.lanl.gov/content/immunology/hlatem/index.html.
The peptides were clade-specific sets of adapted 18mers,
overlapping by 11 residues, designed using the PeptGen tool
available at the Los Alamos HIV database
(http://www.hiv.lanl.gov/content/sequence/PEPTGEN/peptgen.html).
The individual OLP in the peptide sets for clade B and
clade C all had the same starting and ending position
relative to the source protein and follow the same
numbering across the entire viral proteome for both clades.
Peripheral blood mononuclear cells (PBMCs) were separated
from whole blood by density centrifugation and used
directly to test for CD8+ T cell responses in vitro. IFN-γ
ELISpot assays were performed as described previously,
using Mabtech antibodies (Mabtech, Stockholm, Sweden) and
a matrix format that allowed simultaneous testing of all
410 overlapping peptides (OLP) in the respective test set
[14]. Thresholds for positive responses were defined as
exceeding 5 spots per well (50 SFC/10^6 PBMC) and
exceeding the mean of negative wells plus 3 standard
deviations or three times the mean of negative wells,
whichever was higher. Stimulation with PHA was used as a
positive control in all ELISpot assays.
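
The positivity rule above can be illustrated with a short sketch. This is not the authors' analysis code: the function names are hypothetical, the use of the sample standard deviation is an assumption, and the figure of 100,000 cells per well is inferred only from the stated equivalence of 5 spots per well and 50 SFC per million cells.

```python
from statistics import mean, stdev


def elispot_threshold(negative_wells, min_spots=5):
    """Spot count that a test well must exceed to be scored positive.

    Combines the fixed floor (5 spots per well) with the background rule:
    mean of negative wells + 3 SD, or 3 x mean, whichever is higher.
    """
    neg_mean = mean(negative_wells)
    # Assumption: sample standard deviation; a single well has no spread.
    neg_sd = stdev(negative_wells) if len(negative_wells) > 1 else 0.0
    background_cutoff = max(neg_mean + 3 * neg_sd, 3 * neg_mean)
    return max(min_spots, background_cutoff)


def is_positive(spots, negative_wells):
    """True if the well exceeds both the fixed and the background cutoff."""
    return spots > elispot_threshold(negative_wells)


def to_sfc_per_million(spots, cells_per_well=100_000):
    """Convert spots per well to SFC per 10^6 input cells.

    cells_per_well is an inferred value, not one stated in the text.
    """
    return spots * 1_000_000 / cells_per_well
```

With negative wells of 0, 1 and 2 spots, the background cutoff (4 spots) falls below the fixed floor, so a well needs more than 5 spots to score; with higher background the background rule takes over.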

Definition of functional avidity 

Responses targeting 18mer OLP in HIV-1 Gag p24 were
assessed for their functional avidity using OLP-specific
sets of 10mer peptides overlapping by 9 residues that span
the 18mer peptide sequence. Functional avidity was defined
as the peptide concentration needed to elicit half-maximal
response rates in the ELISpot assay and was calculated from
a sigmoidal dose-response curve fit using GraphPad Prism
software [13].
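
As an illustration of how a half-maximal concentration can be read from a titration, the sketch below fits a simplified sigmoid by least squares. It is only a stand-in for the Prism fit: the model (baseline fixed at zero, Hill slope fixed at 1), the grid search, and all function names are assumptions made for this example, since the exact Prism model settings are not given in the text.

```python
import math


def sigmoid_fraction(log_conc, log_ec50):
    """Fraction of the maximal response at a given log10 concentration."""
    return 1.0 / (1.0 + 10.0 ** (log_ec50 - log_conc))


def fit_ec50(concentrations, responses, grid_step=0.01):
    """Least-squares fit of response = top * sigmoid_fraction(log10 c).

    log10(EC50) is scanned over the tested concentration range; for each
    candidate the best-fitting plateau (top) has a closed-form solution.
    Returns (EC50, top).
    """
    log_c = [math.log10(c) for c in concentrations]
    lo, hi = min(log_c), max(log_c)
    steps = int(round((hi - lo) / grid_step))
    best = None
    for i in range(steps + 1):
        log_ec50 = lo + i * grid_step
        f = [sigmoid_fraction(x, log_ec50) for x in log_c]
        top = sum(r * fi for r, fi in zip(responses, f)) / sum(fi * fi for fi in f)
        sse = sum((r - top * fi) ** 2 for r, fi in zip(responses, f))
        if best is None or sse < best[0]:
            best = (sse, log_ec50, top)
    _, log_ec50, top = best
    return 10.0 ** log_ec50, top
```

A lower fitted concentration corresponds to a higher functional avidity, since less peptide is needed to reach half of the maximal response.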

In vitro viral replication inhibition assay 

A double mutant virus containing the Nef M20A mutation and
the Integrase G140S/Q148H Raltegravir (integrase inhibitor)
resistance mutations was tested for replication in CD4 T
cells in the presence or absence of autologous T cell
lines targeting protective or non-protective OLP. Use of
the Raltegravir-resistant virus prevents potential
replication of autologous virus in the inhibition assays
[28], excludes potential negative impacts on antigen
processing or CTL functions attributed to protease
inhibitors [34] and avoids overlap between the resistance
mutation sites (i.e. G140S/Q148H) and the location of
beneficial and non-beneficial OLP sequences. In brief, the
p83-10 plasmid containing mutations for a methionine
to alanine substitution at position 20 of the Nef protein
and the p83-2 plasmid engineered to contain the G140S
and Q148H mutations in the integrase were combined
to produce a virus that is replication competent, highly
resistant to Raltegravir and does not downregulate HLA
class I in infected cells [35,36]. Although not entirely
physiological, this approach was chosen to potentially
increase the signal in the in vitro inhibition assay, even
when responses were restricted by Nef-sensitive HLA
class I alleles. Plasmids were co-transfected into MT4
cells and virus was harvested after 7 days [35,37,38].
Autologous CD4 cells were enriched by magnetic bead
isolation (Miltenyi) and expanded for 3 days using a
bi-specific anti-CD3/8 antibody and IL-2 containing
medium (50 IU r-IL2) before infecting them at
multiplicities of infection (MOI) between 0.01 and 1.
Effector cells were obtained by stimulating PBMC with
either beneficial or non-beneficial OLP for 12 days before
isolating specific OLP-reactive cells by IFN-γ capture
assay according to the manufacturer's instructions
(Miltenyi, Bergisch Gladbach, Germany). The effector T
cells were analyzed by flow cytometry for specificity to
their respective targets after the capture assay and
quantified to adjust effector-to-target ratios. Since the
NL4-3 backbone sequence differed in several positions in
beneficial and non-beneficial OLP, the epitope specificity
was predicted based on the HLA class I genotype of the
tested individual and responses were confirmed to
efficiently recognize variant sequences in the NL4-3
backbone sequence. Culture supernatant was harvested and
replaced by medium containing Raltegravir (0.05 µg/ml)
after 72 h. Levels of Gag p24 in the culture supernatant
were determined by ELISA as described [39].

Statistical Analyses 

Statistical analyses were performed using Prism Version
5 and R Statistical Language [40]. Results are presented
as median values unless otherwise stated. Tests included
ANOVA, non-parametric Mann-Whitney test (two-tailed) and
Spearman rank test. The significance of differences in
viral load distribution between OLP-responders and
OLP-non-responders was assessed by a two-sided Student's
t-test, with multiple comparisons addressed using a
q-value approach rather than a Bonferroni correction [39].
The multivariate analysis was based on a novel
multivariate combined regression method known as PASS, a
forward selection method combined with all-subsets
regression [41-43]. Briefly, the PASS approach works by
iteratively performing the following procedure: Let 'V'
be the set of all variables and 'M' be the set of variables
included in the model. In the first step, those variables
that are not already in the model are divided into
equal-sized blocks of 'g' variables (the last block may
have fewer than 'g' variables). Then, for each block of
variables, a new model 'm'' is estimated and evaluated
using the Bayesian Information Criterion (BIC). The best
model 'm'' according to its BIC is retained and the
procedure starts all over again until one or more steps
fail to improve the model.
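
The block-wise selection loop described above can be sketched as follows. This is an illustrative reading of the description, not the published PASS implementation [41-43]: the assumption that every non-empty subset of a block is tried as an addition to the current model, the Gaussian form of the BIC, the plain least-squares fit, and all names are choices made for this example.

```python
import itertools
import math


def ols_rss(columns, y):
    """Residual sum of squares of a least-squares fit with intercept."""
    rows = [[1.0] + [float(col[i]) for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented normal equations (X'X | X'y), solved by Gauss-Jordan.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-12:
            return None  # collinear design; candidate is skipped
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(k):
            if r != c:
                factor = a[r][c] / a[c][c]
                a[r] = [x - factor * z for x, z in zip(a[r], a[c])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def bic(columns, y):
    """Gaussian BIC: n * ln(RSS / n) + (number of coefficients) * ln(n)."""
    rss = ols_rss(columns, y)
    if rss is None:
        return math.inf
    n = len(y)
    return n * math.log(max(rss, 1e-12) / n) + (len(columns) + 1) * math.log(n)


def pass_select(data, y, g=3):
    """data maps variable name -> list of values; returns the selected names."""
    model = []
    best_bic = bic([], y)
    while True:
        remaining = [v for v in data if v not in model]
        blocks = [remaining[i:i + g] for i in range(0, len(remaining), g)]
        best_candidate = None
        for block in blocks:
            for size in range(1, len(block) + 1):
                for subset in itertools.combinations(block, size):
                    candidate = model + list(subset)
                    score = bic([data[v] for v in candidate], y)
                    if score < best_bic:
                        best_bic, best_candidate = score, candidate
        if best_candidate is None:
            return model  # no block improved the BIC; stop
        model = best_candidate
```

The block size g trades thoroughness for cost: each block of g variables contributes 2^g - 1 candidate models per iteration, so small blocks keep the all-subsets step tractable when hundreds of OLP or HLA variables are screened.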

Results 

HIV-1-specific T cell responses targeting conserved 
regions are associated with lower viral loads 

In a first analysis, HIV-1-specific T cell responses were
assessed in a cohort of 223 HIV-1 clade B infected
individuals recruited in Lima, Peru using IFN-γ ELISpot
assays and a previously described set of 410 clade B
overlapping peptides (OLP) [14,31]. For each OLP, a
protective ratio (PR) was calculated as the ratio of the
log-transformed median viral load in OLP non-responders to
that in OLP responders, such that OLP with PR > 1 were
reflective of OLP predominantly targeted by individuals
with reduced viral loads. OLP-specific PR were a) compared
between OLP spanning the different viral proteins
and b) correlated with the viral sequence heterogeneity
in the region covered by the OLP. The data showed
highest median PR values for OLP spanning the Gag
protein sequence, whereas Nef, Env and Tat had the
lowest median PR values (Figure 1A, p < 0.0001,
ANOVA). A protein-subunit breakdown of PR values
showed the p15 subunit of Gag and RT in Pol to score
less favorably than the remainder of the respective
proteins (Figure 1B, p = 0.0032 and p = 0.0025,
respectively). While these data confirm the association
between HIV-1 Gag-specific responses and lower viral
loads, it is important to note that all proteins contained
OLP with PR > 1, suggesting that some beneficial
responses can be located outside of Gag; a finding that has
not emerged from any of the previous studies linking
Gag responses to relative viral control. At the same
time, all proteins contained OLP with PR < 1, indicating
that proteins considered overall beneficial may contain
non-beneficial regions as well. In addition, when the
OLP-specific PR was compared to the sequence entropy
of the region spanned by the individual OLP, a significant
negative correlation between PR and entropy was
observed (p = 0.0028, r = -0.15; Figure 1C). Although
rarely targeted OLP may have introduced statistically
less robust data points in this comparison and caused a
wide scatter of data points, the results show a relative
absence of OLP with high entropy and high PR values,
suggesting that responses to more variable regions are
less effective in mediating in vivo viral control.

Figure 1 Localization and conservation of beneficial and non-beneficial OLP in HIV-1 clade B and C cohorts. Total HIV-1-specific T cell
responses were assessed in a cohort of 223 chronically HIV clade B infected, untreated individuals in Lima, Peru (graphs A-C) and in 631
chronically HIV clade C infected, untreated individuals in Durban, South Africa (graphs D-F) using peptide test sets of 410 18mer overlapping
peptides (OLP) spanning the consensus B and C sequences, respectively [2,31]. For each OLP, the protective ratio (PR, defined as "the ratio of the
log median viral load in OLP non-responders divided by log median viral load in OLP responders") was determined. Each symbol represents an
individual OLP, grouped either by (A, D) proteins or (B, E) protein-subunits for OLP located in Gag, Pol and Env (p-values in A, D based on
ANOVA, in B, E on Mann-Whitney by pairwise comparing the different protein subunits, red lines indicating median PR values). In (C and F), the
OLP-specific entropy (a measure of the viral diversity in the region the OLP spans) is compared to the OLP-specific PR and shows an inverse
association between the sequence entropy and PR (Spearman rank).
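
For readers who wish to reproduce the protective ratio, a minimal sketch follows. It uses the definition given in the Figure 1 legend (ratio of the log median viral loads); the choice of log base 10 is an assumption that is consistent with the tabulated values, and the function names are hypothetical.

```python
import math
from statistics import median


def pr_from_medians(median_responders, median_non_responders):
    """PR = log10(median VL, non-responders) / log10(median VL, responders)."""
    return math.log10(median_non_responders) / math.log10(median_responders)


def protective_ratio(viral_loads, responded):
    """PR for one OLP from per-subject viral loads and response flags.

    Returns None when either group is empty, since no ratio is defined.
    """
    responders = [vl for vl, r in zip(viral_loads, responded) if r]
    non_responders = [vl for vl, r in zip(viral_loads, responded) if not r]
    if not responders or not non_responders:
        return None
    return pr_from_medians(median(responders), median(non_responders))
```

As a check against the tables, median viral loads of 1,110 copies/ml in responders and 38,200 copies/ml in non-responders (clade C OLP 417, Table 2) give a PR of about 1.50, and 16,458 versus 37,237 copies/ml (clade B OLP 405, Table 1) give about 1.084.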

To assess whether the above observations would also 
hold true outside of clade B infection, the same analyses 
were conducted in a cohort of 631 clade C HIV-1 
infected subjects enrolled in Durban, South Africa and 
tested for responses against a clade C consensus OLP 
sequence as described previously [33]. As in clade B
infection, the OLP-specific PR values were highest for
OLP spanning Gag, without any significant differences
between the Gag and Pol protein subunits (Figure 1D
and 1E). As in the clade B cohort, the PR values were
negatively correlated with the OLP-specific entropy (p =
0.0323, Figure 1F), confirming the findings in the clade
B cohort and further pointing towards the importance 
of targeting conserved segments of the viral proteome 
for effective in vivo viral control. 
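
The rank correlation between PR and entropy can be sketched in a few lines. The implementation below is a generic Spearman coefficient with average ranks for ties, written for illustration; it is not the Prism or R routine used for the published values, and any input vectors are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(entropy, pr):
    """Spearman rho: Pearson correlation of the rank-transformed values."""
    return pearson(average_ranks(entropy), average_ranks(pr))
```

A negative rho, as reported for both cohorts, means that OLP covering more variable regions tend to have lower PR values.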

Identification of individual beneficial OLP sequences in 
clade B and C infection 

In order to identify individual OLP that were signifi- 
cantly more frequently targeted in individuals with rela- 
tive viral control and to compare the beneficial OLP in 
clade B and C infection, the viral load distribution in 
OLP-responders and non-responders was analyzed indi- 
vidually for each OLP. For the clade B cohort in Peru, 
the analyses yielded 43 OLP sequences for which the 
median viral load differed between the two groups with 
an uncorrected p-value of < 0.05. Of these 43 OLP, 26 
were OLP with a PR > 1 (referred to as "beneficial" 
OLP), and 17 OLP with a PR < 1 ("non-beneficial" OLP, 
Table 1). The distribution of OLP with PR > 1 among 
viral proteins was biased towards Gag and Pol, while 
Env produced exclusively OLP with PR < 1 (Figure 2A). 

The same analyses were repeated for the clade C cohort
in Durban, which due to its larger size allowed us to apply
more stringent statistical criteria to identify beneficial
and non-beneficial OLP. To compensate for multiple
statistical comparisons, we employed a previously described
false-discovery rate approach [39], resulting in the
identification of 33 clade C OLP with q-values of < 0.2
(i.e. OLP with significantly different viral load
distributions between OLP-responders and non-responders at
a false discovery rate (q-value) of 20%). The 33
OLP identified comprised 22 beneficial OLP and
11 non-beneficial OLP, with the beneficial OLP being
again located in Gag, Pol and Vif, similar to what was
seen in the clade B cohort (Figure 2B).
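
To show how a false-discovery-rate cut-off operates on a list of per-OLP p-values, the sketch below computes Benjamini-Hochberg adjusted p-values. This is a deliberately simpler, related procedure, not the q-value method of reference [39], so its output would not be expected to match the q-values in Table 2; the function names are hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def scoring_indices(p_values, fdr=0.2):
    """Indices of tests retained at the chosen false discovery rate."""
    return [i for i, q in enumerate(bh_adjust(p_values)) if q < fdr]
```

With a cut-off of 0.2, roughly one in five of the retained OLP would be expected to be a false positive, which is the trade-off accepted in the clade C analysis.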

In both cohorts, the total breadth and magnitude of 
responses did not correlate with viral loads as reported 
for parts of these cohorts in the past [14,16]. The OLP 
with significant differences in median viral loads (43 
OLP in clade B and 33 OLP in clade C, Tables 1 and 2, 
respectively, i.e. "scoring OLP"), were more often tar- 
geted in their respective cohort than OLP that did not 
score with a significant difference in viral loads (p = 
0.0015 Lima; p < 0.0001 Durban). However, beneficial 
and non-beneficial OLP were equally frequently targeted 
in either cohort. Also, there was no difference in the 
median magnitude of the OLP-specific responses, 
regardless of whether it was a beneficial, non-beneficial or 
not-scoring OLP (all p > 0.7, data not shown). Finally, 
there was no correlation between the number of total 
OLP responses (against all 410 OLP) and the magnitude 
of responses to beneficial OLP in either cohort, indicat- 
ing that the strength of beneficial OLP responses was 
not diminished by other responses to the rest of the 
viral proteome. 

In the clade B cohort, the 26 beneficial and 17 non- 
beneficial OLP showed a significant difference in their 
median entropy (p = 0.0327, Figure 2C), in line with the 
overall negative association between higher PR and 
lower sequence entropy seen in the comprehensive 
screening including the entire 410 OLP set (Figure 1C). 
While this comparison was not significant in clade C 
infection, a detailed look at Gag showed that beneficial 
Gag clade C OLP had lower entropy values than the 
rest of the Gag OLP, suggesting that targeting of the 
most conserved regions even in Gag provided particular 
benefits for viral control (Figure 2D, p = 0.0172). These 
beneficial OLP were also more frequently targeted 
(median of 36 responders) compared to the rest of Gag 
OLP (median 12 responders, p = 0.0099), likely reflect- 
ing the high epitope density in these regions [33,44]. 

Finally, the two cohorts showed a partial overlap in 
the targeted beneficial and non-beneficial OLP, despite 
the vastly different HLA genetics in these two popula- 
tions [4,31,45,46]. As Gag was enriched in beneficial 
OLP scattered throughout the entire protein sequence, 
we used the available reverse transcriptase (RT) protein 
structure to assess whether beneficial responses were 
targeting structurally related regions of the protein, even 
though the linear position of beneficial OLP did not 
precisely match between the two clades. Indeed, super- 
imposing the locations of beneficial OLP in the RT pro- 
tein indicates that in both clades, beneficial OLP fell in 
structurally related domains of the RT protein (Figure 
Table 1 Beneficial and non-beneficial OLP identified in Lima clade B cohort (p < 0.05)

Sequence: OLP clade B consensus sequence. VL resp. and VL non-resp.: median viral load in OLP responders and OLP non-responders. PR: protective ratio.

OLP #      Protein  Sub-unit  Sequence             VL resp.  VL non-resp.  PR*    p-value
[missing]  Pol      RT        [illegible]          116902    32761         0.891  0.019
269        Pol      Int       [illegible]          6629      35755         1.192  0.030
270        Pol      Int       [illegible]          18171     37360         1.073  0.019
271        Pol      Int       [illegible]          25939     35755         1.032  0.043
276        Pol      Int       [illegible]          6629      35755         1.192  0.021
279        Vpr      -         [illegible]          60222     32650         0.944  0.042
307        Env      Gp120     [illegible]          179419    34117         0.863  0.044
311        Env      Gp120     1 RDKVQKEYAbFYKbDW   179419    32871         0.860  0.008
314        Env      Gp120     YRblSCNTSVITQACPKV   58206     31273         0.943  0.008
315        Env      Gp120     SVITQACPKVSFEPIPIH   61011     32871         0.944  0.034
320        Env      Gp120     TNVSWQCTHGIRPW       341587    34640         0.820  0.034
355        Env      Gp120     VAPTKAKRRWQREKRAV    161602    34117         0.870  0.042
399        Env      Gp41      VIEWQRACRAIbHIPRR    388089    34640         0.812  0.026
405        Vif      -         VKHHMYISGKAKGWFYRH   16458     37237         1.084  0.021
406        Vif      -         GKAKGWFYRHHYESTHPR   16458     37237         1.084  0.022
424        Vif      -         TKbTEDRWNKPQ^GHR     10319     36434         1.137  0.014

* PR > 1 indicates OLP-responses seen more frequently in individuals with reduced viral loads
Figure 2 Genome distribution, entropy and RT localization of OLP with significant impact on viral loads in HIV-1 clade B and C
infection: The distribution of OLP with significantly elevated or reduced PR across the viral proteome is shown in A) for clade B infection
(cut-off uncorrected p-value of p < 0.05) and in B) for clade C infection (cut-off q < 0.2). The entropy of beneficial and non-beneficial clade B OLP is
compared in C) while in D), the entropy of beneficial OLP in HIV clade C Gag is compared to the remainder of Gag OLP (p-values based on
Mann Whitney, red lines indicating median sequence entropies). In E and F, protein structures for HIV-1 reverse transcriptase (Protein Data Bank
structure ID 3IG1) were loaded into the Los Alamos HIV Database "protein feature accent" tool
(http://www.hiv.lanl.gov/content/sequence/PROTVIS/html/protvis.html) and locations of beneficial RT OLP identified in clade B (Table 1)
and in clade C (Table 2) marked by red highlights.
Table 2 Beneficial and non-beneficial OLP identified in Durban clade C cohort (q < 0.2)

Sequence: OLP clade C consensus sequence. VL resp. and VL non-resp.: median viral load in OLP responders and OLP non-responders. PR: protective ratio.

OLP #        Protein  Sub-unit  Sequence               VL resp.     VL non-resp.  PR*          p-value      q-value
3            Gag      p17       [illegible]            18,700       45,100        1.09         0.0002       [illegible]
6            Gag      p17       [illegible]            6,570        44,100        1.22         0.0000       0.0000
7            Gag      p17       [illegible]            5,270        43,900        1.25         0.0000       0.0000
22           Gag      p24       [illegible]            [illegible]  [illegible]   [illegible]  0.0000       0.0000
25           Gag      p24       [illegible]            24,450       43,200        1.06         0.0021       0.0266
26           Gag      p24       [illegible]            [illegible]  [illegible]   1.25         [illegible]  [illegible]
27           Gag      p24       [illegible]            9,715        42,100        1.16         0.0015       0.0170
29           Gag      p24       [illegible]            19,700       40,900        1.07         0.0045       0.0544
31           Gag      p24       [illegible]            6,480        38,950        1.20         0.0146       0.1478
33           Gag      p24       [illegible]            11,650       40,900        1.13         0.0025       0.0318
37           Gag      p24       [illegible]            9,360        44,100        1.17         0.0004       0.0018
39           Gag      p24       [illegible]            2,630        38,250        1.34         0.0182       0.1838
41           Gag      p24       [illegible]            22,150       44,100        1.07         0.0020       0.0263
42           Gag      p24       [illegible]            16,480       40,900        1.09         0.0078       0.0935
55           Gag      p15       [illegible]            7,550        39,700        1.19         0.0092       0.1047
59           Gag      p15       [illegible]            9,840        42,200        1.16         0.0046       0.0539
60           Gag      p15       [illegible]            6,130        39,700        1.21         0.0066       0.0799
63           Gag      p15       [illegible]            6,040        38,950        1.21         0.0093       0.1020
116          Tat      Tat       [illegible]            109,000      36,700        0.91         0.0033       0.0410
[illegible]  Pol      RT        [illegible]            [illegible]  [illegible]   [illegible]  [illegible]  [illegible]
181          Pol      RT        [illegible]            7,100        38,950        1.19         0.0186       0.1832
190          Pol      RT        [illegible]            84,900       34,700        0.92         0.0043       0.0555
199          Pol      RT        [illegible]            6,700        38,300        1.20         0.0198       0.1926
216          Pol      RT        [illegible]            18,150       43,000        1.09         0.0026       0.0317
239          Pol      RT        QVD KLVSSG 1 R KVLF L  373,200      37,700        0.82         0.0205       0.1937
253          Pol      Int       PAETGQETAYFILKLAGR     92,800       35,400        0.92         0.0082       0.0954
265          Pol      Int       AVFIHNFKRKGGIGGYSA     63,650       33,800        0.94         0.0178       0.1826
283          Vpr      -         GLGQYIYE^GDTWGV        78,000       35,600        0.93         0.0126       0.1302
284          Vpr      -         E^GD^A^GVEALIRIL       85,050       35,200        0.92         0.0099       0.1034
312          Env      Gp120     YALFYRLDIVPLNENNSSEY   270,000      37,700        0.84         0.0208       0.1915
365          Env      Gp41      GIKQLQTRVLAIERYLK      151,000      34,700        0.88         0.0001       0.0002
393          Env      Gp41      LLGRSSLRGLQRGWEALKYL   750,000      37,450        0.78         0.0007       0.0041
417          Vif      -         CFADSAIRKAILGHIV       1,110        38,200        1.50         0.0178       0.1891

* PR > 1 indicates OLP-responses seen more frequently in individuals with reduced viral loads



2E and 2F). This suggests that despite differences in 
response patterns between ethnicities and clades, viruses 
from both clades may be vulnerable to responses target- 
ing the same structural regions of at least some of their 
viral proteins. 

Increased breadth of responses against beneficial OLP is 
associated with decreasing viral loads, independent of 
Gag-specificity or the presence of protective HLA class I 
alleles 

To assess whether individuals targeting more than one 
beneficial OLP profit from a greater breadth of 
responses to these targets, subjects in both cohorts were 
stratified by the number of responses to beneficial OLP 
and their viral loads compared. In both cohorts, negative 



correlations between the number of responses to benefi- 
cial OLP and viral loads were observed (p < 0.0001, r = 
-0.33 for Lima; p < 0.0001, r = 0.-25 for Durban; data 
not shown), suggesting that there is a cumulative benefit 
of responses to these particularly effective targets. Simi- 
larly, when individuals in the clade C cohort were 
grouped based on mounting 1-2, 3-4 or five and more 
beneficial OLP responses, a gradual reduction in median 
viral loads was seen. This reduction was close to 20-fold 
when 5 or more of the 22 beneficial OLP were targeted 
(median viral load 5,210 copies/ml) compared to indivi- 
duals without a response (98,800 copies/ml. Figure 3A). 
Importantly, this observation was not driven only by 
individuals expressing HLA class I alleles associated 
with relative control of viral replication (including HLA- 
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Controllers Non-controllers 
Figure 3 Increased breadth of responses to beneficial OLP results in gradually reduced viral loads and is independent of cohort and 
HLA-B27, -57, -B58, -B81 and -B63. (A) The number of responses to beneficial OLP in tine clade C coliort in Durban was determined for eacli 
individual and compared to viral loads. An increased breadth of responses to the 22 beneficial OLP was associated with reduced viral loads 
(ANOVA, p < 0.0001). (B) This association remained equally stable after removing all individuals expressing known beneficial HLA allele (HLA-B27, 
-B57, -B5801, -863, -B81) from the analysis (ANOVA, p < 0.0001). (C) The set of 26 beneficial and 17 non-beneficial OLP identified in the clade B 
infected cohort in Lima, Peru was tested in a second clade B infected cohort in Barcelona. HIV controllers showed a significantly higher focus of 
responses on the 22 beneficial OLP (61% of all responses to the 43 OLP) while non-controllers reacted predominantly with the non-beneficial 
OLP (only 29% of all responses targeting beneficial OLP). The Barcelona cohort did not included subject expressing any HLA allele previously 
associated with relative control of HIV-1 (p = 0.001 1, Mann Whitney). 
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B27, -B57, -B^^5801, -B63 and -B81) as their exclusion 
still showed a strong association between increased 
breadth of responses to beneficial OLP and a gradual 
suppression of viremia (Figure 3B). This was further 
supported when translating the clade B data from Peru 
to a second clade B infected cohort in Barcelona, Spain 
where HIV-1 controllers also mounted a significantly 
greater proportion of their responses to the beneficial 
Peruvian OLP compared to the HIV-1 non-controllers 
(61% vs. 29%, p = 0.0011; Figure 3C), even though the
Barcelona cohort was genetically different
and excluded individuals expressing HLA-B27, -B57,
-B58 and -B63. Thus, despite the frequent targeting of
Gag and the inclusion of individuals expressing HLA
alleles such as HLA-B*5701 and -B*5801 in the two larger
clade B and C cohorts, the present data identify
regions of the viral genome that serve as the targets of 
an effective host T cell response, largely independent of 
the presence of HLA alleles known to influence HIV-1 
viral replication. 

PR-values are mediated by individuals with broad HLA 
heterogeneity 

To further assess the contribution of specific HLA class
I alleles to the PR of individual OLP, the statistically
significant OLP in the clade C cohort were further ana- 
lyzed. In a first step, median viral loads in the OLP- 
responder and non-responder groups were compared 
after excluding individuals with specific HLA class I 
alleles. If the statistical significance of the comparison 
was lost, the excluded HLA class I allele was assumed 
to have significantly contributed to the initially observed 
elevated or reduced PR value and to restrict a potential 
CTL epitope in that OLP. In a second step, a "Hepitope"
analysis (http://www.hiv.lanl.gov/content/immunology/hepitopes)
was conducted to identify HLA class I alleles
overrepresented in the OLP responder group, providing
an alternative approach to identify specific epitopes that
may contribute to relative viral control. Together, the
two strategies make it possible to estimate the HLA diversity in
the OLP responders and to identify the most likely
alleles that restrict the epitope-specific responses to the
OLP. Both are important measures when determining
the relative usefulness of a selected beneficial OLP in a
potential immunogen sequence, as it should provide
broad HLA coverage. The data from these analyses are
summarized for beneficial and non-beneficial OLP in
Table 3A and 3B, respectively. The results demonstrate
that with a few exceptions, for each OLP, several HLA 
alleles appeared to be mediating the observed effects as 
their removal caused the statistical significance to be 
lost. However, for the most frequent HLA class I alleles, 
the loss of significance may be due to a reduction in 
sample size rather than the actual allele, since the
exclusion of many allele carriers could reduce the number
of OLP responders (and non-responders) sufficiently
to lose statistical power. The "Hepitope" analysis con- 
trolled for this effect and confirmed the obtained results, 
strongly indicating that responses to beneficial OLP 
were mediated by responder populations with heteroge- 
neous HLA allele distributions. 
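
The allele-exclusion step described above can be illustrated with a minimal sketch. It assumes that viral loads are handled as log10 values and that the PR is the ratio of the median viral load in OLP non-responders to that in responders; the field names, the toy cohort and the function names are hypothetical and are not taken from the study. The significance test applied after each exclusion is deliberately left out, so the sketch only shows how the PR is recomputed once carriers of a given allele are removed.

```python
from statistics import median

def protective_ratio(subjects, olp):
    """PR = median viral load of OLP non-responders / median of responders."""
    responders = [s["log_vl"] for s in subjects if olp in s["responses"]]
    non_responders = [s["log_vl"] for s in subjects if olp not in s["responses"]]
    return median(non_responders) / median(responders)

def pr_without_allele(subjects, olp, allele):
    """Recompute the PR after excluding every carrier of one HLA allele."""
    kept = [s for s in subjects if allele not in s["hla"]]
    return protective_ratio(kept, olp)

# Hypothetical toy cohort (invented values, not study data).
cohort = [
    {"log_vl": 3.0, "hla": {"B57", "C07"}, "responses": {22}},
    {"log_vl": 4.0, "hla": {"A02", "C07"}, "responses": {22}},
    {"log_vl": 4.5, "hla": {"A30", "B42"}, "responses": {22}},
    {"log_vl": 5.0, "hla": {"A30", "B42"}, "responses": set()},
    {"log_vl": 5.2, "hla": {"A02", "B15"}, "responses": set()},
    {"log_vl": 4.8, "hla": {"B57", "C04"}, "responses": set()},
]

pr_all = protective_ratio(cohort, 22)            # all subjects
pr_no_b57 = pr_without_allele(cohort, 22, "B57")  # B57 carriers excluded
```

In this toy example the PR remains above 1 after removal of the B57 carriers, which is the pattern expected when the effect is not carried by a single allele.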

Effects of T cell specificity on in vivo viral load are at 
least as strong as those associated with host HLA 
genetics 

To assess whether specific response patterns and/or 
HLA combinations could be identified that mediated 
synergistic or superior control of viral infection in clades 
B and C, multivariate combined regression analysis was 
conducted on either OLP only, HLA only or the combi- 
nation of OLP and HLA variables [41-43]. The OLP- 
only analysis for Lima identified 7 OLP, of which 4 were
associated with lower median viral loads and 3 with
higher viral loads (Table 4). Targeting
at least one of these beneficial clade B OLP was associated
with significantly reduced viral loads (median 11,079
copies/ml) compared to the subjects who did not
target any of these four OLP (median 52,178 copies/ml;
p < 0.0001, Figure 4A). As seen in the univariate analysis
(Figure 2C), the four beneficial OLP emerging from
the Lima FASS analysis were more conserved than the 
rest of the OLP (median entropy 0.0759 vs. 0.1649, p = 
0.0267) or the three non-beneficial OLP (0.0759 vs. 
0.1228, p = 0.0571, data not shown). In contrast to the
OLP-only FASS analysis, only one HLA allele (HLA-C04)
emerged from the HLA-only multivariate analysis.
The analysis of the combined variables (OLP and HLA)
controlled for the potential bias in this result due to
more OLP variables (n = 389) than HLA variables (n = 146)
being included in the statistical tests, yet it still identified
more OLP variables (n = 9) than HLA class I alleles (n
= 3). In addition, the relative coefficients of these associations
were stronger for the OLP than for the HLA variables,
suggesting that T cell specificity influenced viral
loads to at least the same degree as host HLA class I
genetics. Of note, the identified OLP and HLA variables
did not reflect responses to known optimal CTL epitopes,
as none of the OLP contained described epitope(s)
restricted by any of the identified HLA alleles [44].

Results from the clade C cohort in Durban confirmed 
the clade B findings in Lima as the FASS analyses iden- 
tified 16 OLP but only 8 HLA variables that had an 
impact on the individual viral loads. As in Lima, the
impact of OLP specificity was at least as strong as that of
HLA genotype (trend for higher coefficients for OLP
than HLA; data not shown, p > 0.05). In addition, targeting
at least one of the eight beneficial OLP in Durban
was associated with strongly reduced viral loads (p < 
Table 3 Impact of HLA alleles on the statistical significance of observed PR values (clade C OLP)

A) Beneficial OLP (PR > 1)

OLP | Protein | PR | Removed HLA allele(s) abolishing statistical significance (1) | Alleles over-represented in the OLP responder group (2)
3 | Gag | 1.09 | ASO, B42, C17 | A30, BOS, AOS, A74, C17, A4S, B42, B07
6 | Gag | 1.22 | B15 | B49, BS2, C14
7 | Gag | 1.25 | - | B42, C17, B49, ASO
22 | Gag | 1.18 | B57, C07 | B57, A74, B45, C07, C16, BIS
25 | Gag | 1.06 | MO, B15, C04, C07 | B42, C17, B81, BS9, A01, C12, CIS, ASO, B67
26 | Gag | 1.2S | A02, A23, A68, B07, BM, B58, C07, COS | COS, B15, A68
27 | Gag | 1.16 | B15, C07 | B15, A68, COS, COS
29 | Gag | 1.07 | A68, B15, B58, C02, C03, C06, 0 2 | BS5, BS9, C12, B40, B07, C04
31 | Gag | 1.2 | A02, A11, A23, A29, A32, A34, A68, B07, Bl S, BIS, B15, B42, B44, B58, C04, C06, C07, C17 | A29, C06, A11
33 | Gag | 1.13 | A02, A23, B44, B57, B58, C07 | B58, B57, A02, C07, COS, A68
37 | Gag | 1.17 | /\3a B42, C17 | CIS, B42, C17, A01, B81
39 | Gag | 1.34 | A02, A03, A23, A29, A30, A68, A74, BOB, B15, BIS, B42, B45, B5S, B57, B5S, C02, COS, C06, C07, COS, C16, C17 | A02
41 | Gag | 1.07 | A23, C06 | COS, B14, A6S, COS, B15
42 | Gag | 1.09 | A23, A30, BOS, B15, B42, B58, C03, C04, C07 | B5S, COS
55 | Gag | 1.19 | A02, A24, A29, A30, B07, B15, B39, M2, B44, B5S, C02, C06, C07, C17 | B42, BOS, C17
59 | Gag | 1.16 | A02, A30, BOS, B42, B44, B5S, C04, C07, C17 | A02, BIS, A29
60 | Gag | 1.21 | A02, A30, B42, B5S, C06, C07, C17 | A02, B41, C07, C17
63 | Gag | 1.21 | A02, A2S, A29, A30, A6S, BOS, B15, B44, B5S, C02, COS, C06, C07 | A2S
181 | Pol | 1.19 | A01, A23, A29, A30, A34, A6S, A74, B14, B15, BIS, BS5, B44, B45, B57, B5S, C02, COS, C04, C06, C07, COS, C16 | B57, CIS
199 | Pol | 1.2 | A02, A03, A2S, A24, A26, A30, A31, A34, A36, A66, A6S, ASO, BOS, BIS, B15, BIS, BS5, B40, B41, B42, B44, B45, B49, B50, B51, B5S, B57, B5S, BSl, C01, C02, COS, C04, COS, C06, C07, COS, CIS, C16, C17 | B5S, A2S, C04
216 | Pol | 1.09 | A02, /\3a B5S, C07, C17 | B5S, B5S, C07, B57
417 | Vif | 1.5 | AOS, A23, ASO, A34, AS6, A6S, BOS, B14, BIS, B44, B5S, B5S, COS, C04, C06, COS | B14, COS, AS6

B) Non-beneficial OLP (PR < 1)

116 | Tat | 0.91 | A02, A34, B15, C04 | B15, C02
178 | Pol | 0.S4 | AOS, A6S, 5/5, B5S, C04, C06, C07 | A6S, C06, B5S, BS2, AOS
190 | Pol | 0.92 | AOS, ASO, A66, BIS, B42, B45, BSS, C06, C07 | A02, BIS, BS5, C05, C16, B45, ASO, C12, B67, BS9
239 | Pol | 0.S2 | - | C05, AOS
253 | Pol | 0.92 | AOS, A68, B15, BS9, B42, B44, BSS, C02, C04, C06, COS, C17, CIS | A6S, COS, B15, B07, C15, B41
265 | Pol | 0.94 | A02, AOS, A2S, A24, A26, A29, ASO, ASl, ASS, AS4, A66, A6S, B07, BOS, B14, BIS, B27, B40, B41, B42, B44, BSS, BS4, BSS, BS7, BSl, C01, C02, C04, C06, C07, C12, C17, CIS | B15, C02, A4S, A74
283 | Vpr | 0.93 | A02, AOS, A2S, ASO, A66, A6S, A74, B07, B14, BIS, BS9, B41, B42, B4S, BS7, C02, C04, C07, COS, CIS, C17 | A6S, COS, B07, C17, B41
284 | Vpr | 0.92 | AOS, A2S, ASO, A66, A6S, A74, B07, B14, BIS, BS9, B42, B4S, BS7, C07, COS, CIS, C17 | A6S, COS
312 | Env | 0.84 | - | BOS, C07
365 | Env | 0.88 | B58, C06 | C06, B58, A43, B45, C16, A66
393 | Env | 0.78 | A30, B58, C06 | A31, C06, B45

1) In italics: HLA alleles that do not emerge from the Hepitope analysis. 2) Cut-off in Hepitope analyses: p < 0.05; alleles sorted according to strength of
association.



0.0001, Figure 4B). This effect was, as in the univariate 
analysis, additive for more than one response (p < 
0.0001, Figure 4C) and included OLP that were, aside 
from Gag, located in Pol and Vif. Also, the combined
(OLP and HLA) analysis suggests that the effect of OLP specificity
on viral loads is at least as strong as that of HLA
genetics, as 8 OLP and 7 HLA variables were identified.
This is especially so since, among the 7 HLA alleles, two
(HLA-B57 and HLA-A74) are in linkage disequilibrium
[47], further reducing the number of HLA
variables with a significant impact on viral loads.
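
The additive effect of responding to more than one beneficial OLP can be made concrete with a short sketch that stratifies individuals by the number of beneficial OLP they target and reports the median viral load per stratum. The OLP numbers, viral loads and field names below are invented for illustration; the sketch does not reproduce the FASS analysis or the ANOVA used for Figure 4C.

```python
from collections import defaultdict
from statistics import median

def median_vl_by_breadth(subjects, beneficial_olp):
    """Group subjects by the number of beneficial OLP they target and
    return {breadth: median viral load} for each group."""
    groups = defaultdict(list)
    for s in subjects:
        breadth = len(s["responses"] & beneficial_olp)
        groups[breadth].append(s["vl"])
    return {b: median(v) for b, v in sorted(groups.items())}

# Hypothetical subset of beneficial OLP numbers and toy subjects (not study data).
beneficial = {3, 22, 181}
subjects = [
    {"vl": 60000, "responses": {116}},
    {"vl": 45000, "responses": set()},
    {"vl": 20000, "responses": {3}},
    {"vl": 12000, "responses": {22, 116}},
    {"vl": 4000, "responses": {3, 22}},
    {"vl": 2500, "responses": {22, 181}},
]

result = median_vl_by_breadth(subjects, beneficial)
```

Responses to OLP outside the beneficial set (here OLP 116) do not add to the breadth count, so the strata reflect only the beneficial targets.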

Responses to beneficial OLP are of higher functional
avidity and suppress viral replication in vitro more
effectively than responses to non-beneficial OLP

Functional avidity and the ability to suppress in vitro 
viral replication have emerged as two potentially crucial 
parameters of an effective CTL response against HIV-1 
[23-29]. To assess this potential functional characteristic 
of beneficial CTL populations, we determined the func- 
tional avidity of responses to the four beneficial OLP 
located in Gag p24, a region that has been most consis- 
tently associated with eliciting relatively protective CTL 
responses. As 18-mer peptides are suboptimal test peptides
for determining functional avidity, 10-mer overlapping
peptide sets were synthesized to cover the four beneficial
OLP and all detected responses were titrated. The
SD50% was determined for comparable numbers of
responses detected in controllers (n = 21 responses) and
non-controllers (n = 24 responses) and showed a statistically
significant difference between the two groups
(median 3,448 ng/ml vs. 25,924 ng/ml, p = 0.0051, Figure 5A).
This reduced avidity of responses to beneficial OLP in HIV-1
non-controllers could possibly explain why these
individuals did not control their in vivo viral replication
despite targeting these regions in some instances and
with responses of comparable magnitude to those of HIV-1
controllers (278 SFC vs. 305 SFC/10^6 PBMC, p = 0.55, data
not shown).
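
As an illustration of how a titration curve can be reduced to a single avidity value, the following sketch estimates the peptide concentration that elicits half of the maximal response. It assumes that SD50% denotes this half-maximal concentration and uses linear interpolation on a log10 concentration scale; both the interpolation scheme and the example titration values are assumptions made for illustration and are not the fitting procedure reported for this study.

```python
import math

def sd50(concentrations, responses):
    """Estimate the peptide concentration giving 50% of the maximal response
    by linear interpolation on a log10 concentration scale.
    Concentrations must be sorted in ascending order."""
    half_max = max(responses) / 2
    points = list(zip(concentrations, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo < half_max <= r_hi:
            frac = (half_max - r_lo) / (r_hi - r_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None  # half-maximal response not bracketed by the titration

# Hypothetical titration: peptide in ng/ml and spot counts per well.
conc = [1, 10, 100, 1000, 10000]
high_avidity = sd50(conc, [0, 50, 150, 200, 200])
low_avidity = sd50(conc, [0, 0, 20, 100, 200])
```

A lower value corresponds to a higher functional avidity, since less peptide is needed to reach the half-maximal response.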

To more directly assess whether responses to benefi- 
cial OLP were of particularly high functional avidity, 
regardless of HIV controller status, we determined 
SD50% of responses to 17 optimal epitopes from benefi- 
cial, neutral and non-beneficial OLP (Figure 5B). Med- 
ian epitope-specific SD50% were determined from an 
average of 7 titrations per epitope and compared to the 



OLP specific PR. A strongly significant, negative associa- 
tion between the PR and the SD50% was noted (p = 
0.002, r = -0.69), indicating that beneficial OLP are tar- 
geted by high-avidity responses. To control for inter- 
individual differences due to disease status and viral 
load, we identified 10 individuals who targeted optimal 
epitopes in beneficial and non-beneficial OLP and determined
their functional avidity. As in the preceding cross-sectional
analysis, this matched comparison showed in all
cases a higher functional avidity for the epitopes located
in the beneficial OLP compared to the responses targeting
non-beneficial OLP (Figure 5C, p = 0.0020). Lastly,
to relate the higher functional avidity to potential super- 
ior anti-viral effects in vivo, the ability to inhibit in vitro 
viral replication was assessed in three individuals who 
mounted robust responses against both beneficial and 
non-beneficial OLP. The in vitro inhibition assay first
developed by Yang et al [48] was modified so that the
NL4-3 based test virus contained a single nucleotide
mutation in Nef (M20A) that blocks the Nef-mediated
down-regulation of HLA class I molecules, as well as
two mutations in the integrase gene that confer raltegravir
resistance, to permit the suppression of potentially
replicating autologous virus in the assay. Indeed, CTL
specific for the beneficial OLP(s) were up to 2 logs
more effective at inhibiting viral replication than CTL targeting
non-beneficial OLP (Figure 5D), in line with
recent data demonstrating different suppressive ability 
of HIV-1 specific CTL populations targeting Gag and 
Env-derived epitopes [24]. Although the in vitro inhibi- 
tion assays were limited to few individuals with suitable 
response patterns, these data together with the results 
from the extensive titration assays in Figure 5B and 5C 
indicate that responses to beneficial OLP are of particu- 
larly high functional avidity and inhibit in vitro viral 
replication more effectively than responses to non-bene- 
ficial OLP. Of note, higher avidity responses to benefi- 
cial OLP compared to non-beneficial OLP were seen in 
all 10 tested individuals, ruling out that inter-individual 
variability in viral loads, duration of infection and HIV 
disease status could have biased the analyses. 

Conclusions 

Defining functional correlates of HIV-1 immune control 
is critical to the design of effective immunogens. T cell 
responses to specific HIV-1 proteins and protein- 



Mothe et al. Journal of Translational Medicine 2011, 9:208
http://www.translational-medicine.com/content/9/1/208



Table 4 Multivariate analysis of OLP and HLA variables for clade B and C cohorts 
Figure 4 Responses to OLP identified in multivariate analysis are associated with reduced viral loads. Response patterns and HLA class I
genetics in the clade B cohort in Lima and the clade C cohort in Durban were subjected to FASS multivariate analysis [41-43]. Viral loads in
individuals mounting zero vs. at least one response to beneficial OLP identified by the FASS multivariate analysis were compared for (A) the
Lima clade B cohort and (B) the Durban clade C cohort. The larger data set for the clade C cohort allowed for a further stratification of the
responder group by increasing numbers of targeted OLP emerging from the FASS analysis (C). A gradually declining median viral load in relation
to an increasing breadth of these responses was seen (ANOVA, p < 0.0001).
Figure 5 Responses to beneficial OLP are of higher functional avidity and suppress in vitro viral replication more effectively. (A)
Responses to the four beneficial OLP located in HIV-1 clade B Gag p24 were retested using a peptide set of 10-mers overlapping by 9 residues.
A total of 21 responses in HIV-1 controllers and 24 responses in HIV-1 non-controllers were titrated and the SD50% compared between the two
groups, showing a significantly higher functional avidity in the controllers (p = 0.0051, Mann Whitney). (B) Responses to 17 different optimally
defined CTL epitopes located in beneficial, neutral and non-beneficial OLP were titrated in samples from 78 HIV infected individuals with variable
viral load and disease status. The median SD50% (ng/ml) was defined for each epitope and compared to the OLP-specific protective ratio
(Spearman Rank test, p = 0.0020). (C) Ten individuals who mounted responses to well-defined optimal CTL epitopes located in beneficial as well
as in non-beneficial clade B OLP were identified and their responses titrated. The SD50% for responses detected in the same individual were
compared (Wilcoxon matched pairs test, p = 0.0039). (D) In vitro viral replication inhibition assays [48] were performed using a Nef-modified and
raltegravir-resistant test virus and purified CTL effector populations from the same individual targeting beneficial and non-beneficial OLP. One
representative experiment of three assays conducted in different individuals is shown. Levels of Gag p24 were determined after 4 days of co-culture
of effector cells and autologous CD4 T cells used as target cells. Target cells were stimulated 3 days prior with dual-specific anti-CD3/8
mAb and infected at an MOI of 0.1. The negative control contained wells with target cells only ("no CD8").



subunits have been associated before with relatively 
superior viral control in vivo [14,16,49], but evidence 
from recent clinical trials suggests that including maxi- 
mal immunogen content into various vectors does not 
necessarily induce more effective CTL responses [50,51]. 
In fact, it has been argued that the existence of potential 
"decoy" epitopes may divert an effective CTL response 
towards variable and possibly less effective targets in the 
viral genome [52]. Thus, the definition of a minimal yet 
sufficient immunogen sequence that can elicit CTL 



responses in a broad HLA context is urgently needed. 
Thereby, focusing vaccine responses on conserved 
regions could help induce responses towards mutation- 
ally constrained targets and provide the basis for protec- 
tion from heterologous viral challenge. 

We present here the results of an extensive analysis 
that included more than 950 HIV-1 infected individuals 
with diverse HLA genotypes, from three different continents
and including clade B and C infections. In both
the clade B analysis in Lima and the clade C analysis in Durban,
individual OLP were identified that are predominantly
targeted by individuals with reduced or elevated viral
loads, although the different sizes of the cohorts required
different statistical approaches for their identification. In
general, most of these OLP were among the more frequent
targets in the HIV proteome, possibly due to
both the need for sizable responder groups to achieve
statistical significance in the viral load comparison and
the high epitope density in these OLP. The identified
OLP were frequently located in HIV-1 Gag and
Pol, but rarely in the more variable proteins such as Env 
and Nef. With one exception, Nef and Env featured only 
non-beneficial OLP, thus arguing against their inclusion, 
at least as full proteins, in a CTL immunogen sequence 
[16]. In addition, in both cohorts, the Vif protein yielded 
few, yet exclusively beneficial OLP, which may warrant a 
renewed look at the inclusion of regulatory proteins in 
vaccine design [53,54]. Also common to both clades
(and despite the wide scatter possibly due to the inclusion
of less frequently targeted OLP), a negative correlation
between sequence entropy and PR was observed,
providing a strong rationale for vaccine approaches that
focus on conserved viral regions where T cell escape
may be complicated by structural constraints [55]. This
was particularly evident in the clade C cohort, where 
even within the relatively conserved Gag protein, a 
lower entropy was seen for the beneficial OLP compared 
to the remainder of the OLP spanning the protein. On 
the other hand, while beneficial and non-beneficial OLP 
showed a significant difference in their median entropy 
in the clade B cohort, this comparison was not signifi- 
cant in the clade C cohort. It is possible that the immu- 
nogen sequence, designed in 2001, did not optimally 
cover the circulating viral population in Durban 
throughout the enrollment period (until 2006), leading 
to missed responses particularly in the more variable 
segments of the virus [32,56]. The study may have thus 
failed to identify beneficial as well as non-beneficial 
OLP in the more variable genes of HIV. This should 
have preferentially affected highly variable OLP due to a 
more frequent mismatch between autologous viral 
sequence and in vitro test set in these regions. However, 
even if scoring as beneficial OLP, such high-entropy
OLP may from an immunogen-design point of view be
of less interest, as they would possibly contribute only
little to protection from heterologous viral challenge. It
should also be considered that the OLP-specific
entropy values are based on variable numbers of
sequences in the Los Alamos HIV database covering the 
different OLP, introducing potential further bias into 
these analyses, particularly for less covered proteins 
such as Vpu and other viral protein products. Such dif- 
ferences between autologous viral sequences and in 
vitro test sets may also have impacted the assessment of 



functional avidities. These determinations included 
responses in the same individual towards epitopes 
located in beneficial and non-beneficial OLP, with the
former overall being more conserved. Thus, the higher
functional avidity towards epitopes located in beneficial
OLP could be biased by the higher chance that these
epitopes matched the autologous viral sequence, compared
to epitopes located in non-beneficial OLP, and
may thus have induced a more robust, avid
response. Apart from covering autologous sequences,
future studies will ideally also include comparable ana- 
lyses in individuals identified and tested in acute infec- 
tion that go on to control the infection at undetectable 
levels of viral replication (i.e. elite-controllers) so that 
the selective early emergence of responses to beneficial 
OLP could be linked to relative control of viral replica- 
tion in chronic infection. As is, the identified beneficial 
responses may be particularly important to maintain low 
viral replication in chronic stages of infection, which in 
theory could be different (for instance due to more 
accelerated intra-individual viral evolution in variable 
genes) from responses determining viral set point during 
acute infection. However, the existing HLA bias in such 
cohorts and the small number of responses identified 
during earliest stages of infection may make such ana- 
lyses a formidable undertaking that will require large 
numbers of individuals to be tested longitudinally. 
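
Because the entropy comparisons above depend on how many sequences cover each OLP, it may help to see how such a value can be computed. The sketch below assumes that the entropy of an OLP is the mean per-position Shannon entropy (natural logarithm) over an alignment of sequences spanning that OLP; this definition, the function names and the toy alignment are assumptions for illustration and may differ from the calculation used for the Los Alamos database sequences.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (natural log) of one alignment column."""
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def mean_entropy(aligned_seqs):
    """Mean per-position entropy over equal-length aligned sequences."""
    entropies = [column_entropy(col) for col in zip(*aligned_seqs)]
    return sum(entropies) / len(entropies)

# Hypothetical four-sequence alignment of a four-residue stretch.
conserved = mean_entropy(["ACDE", "ACDE", "ACDE", "ACDE"])
variable = mean_entropy(["ACDE", "ACDE", "ACDF", "ACDF"])
```

With only a handful of sequences per region, a single variant changes the estimate substantially, which illustrates why sparsely covered proteins can bias the entropy values.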

A broadly applicable T cell immunogen sequence 
should include T cell targets restricted by a wide array 
of HLA class I alleles. Although broad representation of 
HLA-B alleles may be particularly important in this 
regard, emerging data on the effects of HLA-C alleles in
these cohorts may warrant a broad HLA-C representation
as well [2,47,57]. In the present study, the 26 beneficial
OLP from Lima and the 22 beneficial OLP from
Durban covered 26 described, optimally defined CTL 
epitopes restricted by 20 different HLA alleles for the 
clade B cohort and 33 epitopes presented by 34 alleles 
for the clade C cohort, respectively [44]. As this is likely 
to be an underestimate of the true diversity in HLA 
restriction (Table 2 and ref [58]), it is reasonable to predict
that the inclusion of the identified beneficial OLP, or
even a subset thereof, could evoke potential responses
in a widely diverse HLA context. This could also provide
the basis for the induction of poly-specific T cell
responses with increased breadth, which the present data
clearly associate with progressively lower viral loads
and which emerges as a potentially important parameter
from several recent vaccine studies showing superior
protection from SIV challenge in animals with
broad vaccine-induced responses to Gag p17 [59,60].

Recent studies have suggested a global adaptation of 
HIV-1 to its various host ethnicities [4,46]. Such
adaptation has led in some cases to the
elimination of protective CTL targets, causing a pro-
found absence of responses to these epitopes and detri- 
mentally changing the association between HLA allele 
and HIV-1 disease outcome [4]. It is thus not surprising 
that the two main cohorts tested here yielded only par- 
tially overlapping sets of beneficial OLP as the impact of 
host genetics and viral evolution in the studied popula- 
tions cannot readily be overcome. In fact, given studies 
by Frahm et al [4], the past and current adaptation of 
HIV-1 to common HLA class I alleles will likely still call 
for somewhat population tailored vaccine approaches, 
especially if the immunogen sequences should be kept 
short to avoid regions of potentially reduced immunolo- 
gical value [52]. Such approaches will also profit from 
more extensive structural analyses that may identify spe- 
cific domains of viral proteins that are or are not 
enriched in valuable T cell targets; of which the latter 
could possibly be ignored for the design of T cell immu- 
nogen sequences. Additional analyses in other geneti- 
cally unrelated cohorts of HIV-1 infected individuals 
and studies in SIV infection may further help to guide 
such selective immunogen design and to understand the 
factors defining the effectiveness of different epitopes in 
mediating relative HIV-1 control. Of note, the beneficial
OLP identified here, 24 in clade B and 22 in clade C
infection, also partly matched other immunogen designs based on
conserved elements; i.e., of the 14
conserved elements proposed by Hanke et al, eight
(57%) overlapped at least partly with beneficial OLP
identified here [61]. Similarly, among the highly conserved
elements proposed by Rolland et al [52], 35% (5/14)
were covered by our beneficial OLP in clade B infection.
These differences possibly emerge because the present
analysis is based on functional T cell data rather
than viral sequence alignments, which may not take into
consideration epitope density and processing preferences
of certain regions. Nevertheless, the partial overlap with
these other immunogen designs supports the focus on
conserved regions and offers the opportunity for alternative
or combined vaccine approaches that elicit responses
to regions where the virus is and possibly remains vulnerable
[4,46,55,62].

Finally, we used the extensive data set available to 
approach the question of relative effects of host genetics 
(i.e. HLA) and CTL specificity on HIV-1 control. While 
the two factors cannot be entirely disentangled, our data 
suggest that CTL specificity has an effect on viral control that is
at least equal to, if not stronger than, that of HLA class I allele
expression. These findings are also in line with data by
Mothe et al [63] showing that targeting key regions in
p24 surrounding the dominant epitopes restricted by
known protective alleles (KK10 for HLA-B27 and TW10
for HLA-B57/58) in HLA-B27, -B57 or -B58 negative individuals
is associated with significantly reduced viral



loads. In addition, the presence of individuals not
expressing known beneficial alleles in HIV-1 elite controller
cohorts [64] further indicates that HIV-1 control
is not necessarily bound to a few specific HLA class I
alleles. A detailed study of the total HIV-1-specific CTL
response of subjects not expressing these alleles yet 
effectively controlling HIV-1 can be expected to provide 
further and crucially needed insight into the importance 
of targeting specific (conserved) regions of the viral gen- 
ome for HIV-1 control. Similarly, the characterization of 
functional attributes of these responses, including func- 
tional avidity and the ability to suppress in vitro viral 
replication will need to be further assessed in such indi- 
viduals. Building on experimentally derived and poten- 
tially promising immunogen sequences as defined here 
may thus provide a suitable basis for further immuno- 
gen design and iterative clinical trials in the human 
setting. 